Using Multilingual Resources for Building SloWNet Faster

Author

  • Darja Fišer
Abstract

This project report presents the results of an approach in which synsets for the Slovene wordnet were induced automatically from parallel corpora and existing wordnets. First, multilingual lexicons were obtained from word-aligned corpora and compared to wordnets in various languages in order to disambiguate the lexicon entries. Then the appropriate synset IDs were attached to the Slovene entries from the lexicon. Finally, Slovene lexicon entries sharing the same synset ID were organized into a synset. The results were evaluated against a gold standard and checked by hand.
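The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the data layout, the function names (`shared_synset_ids`, `induce_synsets`), the toy words, and the synset IDs are all invented for illustration. The sketch also assumes that disambiguation amounts to intersecting the synset IDs of the aligned translations, which is one plausible reading of "compared to the wordnets in various languages".

```python
from collections import defaultdict

# Toy stand-ins for existing wordnets: language -> word -> set of synset IDs.
# All words and IDs here are invented for illustration.
TOY_WORDNETS = {
    "en": {"bank": {"id-1", "id-2"}, "shore": {"id-2"}},
    "cs": {"banka": {"id-1"}, "břeh": {"id-2"}},
}

# Toy multilingual lexicon, as might be extracted from a word-aligned corpus:
# each entry pairs a Slovene word ("sl") with its aligned translations.
TOY_LEXICON = [
    {"sl": "banka", "translations": {"en": "bank", "cs": "banka"}},
    {"sl": "breg", "translations": {"en": "bank", "cs": "břeh"}},
    {"sl": "obala", "translations": {"en": "shore", "cs": "břeh"}},
]


def shared_synset_ids(entry, wordnets):
    """Return the synset IDs common to all translations in one lexicon entry.

    Assumption: a sense is accepted only if every aligned translation
    carries it in its own wordnet (set intersection).
    """
    candidate_sets = [
        wordnets[lang].get(word, set())
        for lang, word in entry["translations"].items()
        if lang in wordnets
    ]
    if not candidate_sets:
        return set()
    return set.intersection(*candidate_sets)


def induce_synsets(lexicon, wordnets):
    """Group Slovene words that were assigned the same synset ID."""
    synsets = defaultdict(set)
    for entry in lexicon:
        for synset_id in shared_synset_ids(entry, wordnets):
            synsets[synset_id].add(entry["sl"])
    return dict(synsets)


if __name__ == "__main__":
    for synset_id, words in sorted(induce_synsets(TOY_LEXICON, TOY_WORDNETS).items()):
        print(synset_id, sorted(words))
```

In this toy data the English word "bank" is ambiguous between two senses, and the second language resolves the ambiguity: the entry aligned with "banka" keeps only the financial sense, while the entry aligned with "břeh" keeps only the riverside sense. Entries that end up with the same ID are then merged into one Slovene synset.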

Similar resources

sloWNet: construction and corpus annotation

This paper presents a wordnet for Slovene which was created semi-automatically with a combination of approaches and multilingual resources, in particular a bilingual dictionary, a parallel corpus and Wikipedia. Analysis of the results shows that the dictionary approach yields a good core wordnet but requires substantial manual editing due to a lack of automatic word-sense disambiguation. This w...

Building a Chinese-English Mapping between Verb Concepts for Multilingual Applications

This paper addresses the problem of building conceptual resources for multilingual applications. We describe new techniques for large-scale construction of a Chinese-English lexicon for verbs, using thematic-role information to create links between Chinese and English conceptual information. We then present an approach to compensating for gaps in the existing resources. The resulting lexicon is...

Visualizing sloWNet

With the increasing popularity of semantic lexica such as wordnets that are being developed for more and more languages the need for tools which enable displaying and management of their content has risen as well. Dictionary writing systems or tools for ontology management are not suitable for use with wordnets because they are concept-based and relational on the one hand but less formal and mo...

Building Specialized Multilingual Lexical Graphs Using Community Resources

We describe methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on collecting data from online community users as the main source; our approach therefore depends on acquiring contributions from volunteers (explicit approach) and on analyzing users' behaviors to extract interesting patterns and facts (implicit approach). As a ...

Multilingual Grammar Resources in Multilingual Application Development

Grammar development makes up a large part of the multilingual rule-based application development cycle. One way to decrease the required grammar development efforts is to base the systems on multilingual grammar resources. This paper presents a detailed description of a parametrization mechanism used for building multilingual grammar rules. We show how these rules, which had originally been des...

Publication date: 2008